Overview

Dataset statistics

Number of variables: 16
Number of observations: 108035
Missing cells: 422673
Missing cells (%): 24.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.2 MiB
Average record size in memory: 128.0 B
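
The derived figures in this block follow from the raw counts. The sketch below recomputes them. It assumes every column stores one 8-byte value per row, which is an assumption that happens to match the reported 128.0 B record size, not something the report states.

```python
# Figures copied from the dataset statistics above.
n_variables = 16
n_observations = 108035
missing_cells = 422673

# Share of all cells (rows x columns) that are missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each column holds one 8-byte value per row
# (a float64 number or an object pointer), i.e. 128 bytes per record.
bytes_per_value = 8
record_bytes = n_variables * bytes_per_value
total_mib = record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Record size: {record_bytes} B, total: {total_mib:.1f} MiB")
```

Under that assumption the script reproduces the 24.5%, 128 B and 13.2 MiB figures.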

Variable types

Categorical: 3
Numeric: 13

Alerts

StationId has a high cardinality: 110 distinct values High cardinality
Date has a high cardinality: 2009 distinct values High cardinality
PM2.5 is highly correlated with PM10 and 4 other fields High correlation
PM10 is highly correlated with PM2.5 and 6 other fields High correlation
NO is highly correlated with PM2.5 and 4 other fields High correlation
NO2 is highly correlated with PM2.5 and 4 other fields High correlation
NOx is highly correlated with PM10 and 3 other fields High correlation
NH3 is highly correlated with PM2.5 and 2 other fields High correlation
CO is highly correlated with PM10 and 1 other field High correlation
Benzene is highly correlated with Toluene and 1 other field High correlation
Toluene is highly correlated with Benzene and 1 other field High correlation
Xylene is highly correlated with Benzene and 1 other field High correlation
AQI is highly correlated with PM2.5 and 6 other fields High correlation
PM2.5 is highly correlated with PM10 and 1 other field High correlation
PM10 is highly correlated with PM2.5 and 3 other fields High correlation
NO is highly correlated with PM10 and 2 other fields High correlation
NO2 is highly correlated with NO and 1 other field High correlation
NOx is highly correlated with PM10 and 2 other fields High correlation
Toluene is highly correlated with Xylene High correlation
Xylene is highly correlated with Toluene High correlation
AQI is highly correlated with PM2.5 and 1 other field High correlation
PM2.5 is highly correlated with PM10 and 1 other field High correlation
PM10 is highly correlated with PM2.5 and 1 other field High correlation
NO is highly correlated with NOx High correlation
NO2 is highly correlated with NOx High correlation
NOx is highly correlated with NO and 1 other field High correlation
Benzene is highly correlated with Toluene and 1 other field High correlation
Toluene is highly correlated with Benzene and 1 other field High correlation
Xylene is highly correlated with Benzene and 1 other field High correlation
AQI is highly correlated with PM2.5 and 1 other field High correlation
PM2.5 is highly correlated with PM10 and 2 other fields High correlation
PM10 is highly correlated with PM2.5 and 3 other fields High correlation
NO is highly correlated with NO2 and 1 other field High correlation
NO2 is highly correlated with NO and 1 other field High correlation
NOx is highly correlated with PM10 and 2 other fields High correlation
CO is highly correlated with AQI High correlation
Benzene is highly correlated with Toluene High correlation
Toluene is highly correlated with Benzene High correlation
AQI is highly correlated with PM2.5 and 3 other fields High correlation
AQI_Bucket is highly correlated with PM2.5 and 2 other fields High correlation
PM2.5 has 21625 (20.0%) missing values Missing
PM10 has 42706 (39.5%) missing values Missing
NO has 17106 (15.8%) missing values Missing
NO2 has 16547 (15.3%) missing values Missing
NOx has 15500 (14.3%) missing values Missing
NH3 has 48105 (44.5%) missing values Missing
CO has 12998 (12.0%) missing values Missing
SO2 has 25204 (23.3%) missing values Missing
O3 has 25568 (23.7%) missing values Missing
Benzene has 31455 (29.1%) missing values Missing
Toluene has 38702 (35.8%) missing values Missing
Xylene has 85137 (78.8%) missing values Missing
AQI has 21010 (19.4%) missing values Missing
AQI_Bucket has 21010 (19.4%) missing values Missing
Benzene is highly skewed (γ1 = 21.61702016) Skewed
NOx has 4776 (4.4%) zeros Zeros
CO has 7280 (6.7%) zeros Zeros
Benzene has 12602 (11.7%) zeros Zeros
Toluene has 10455 (9.7%) zeros Zeros
Xylene has 6083 (5.6%) zeros Zeros
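
The percentages in the Missing and Zeros alerts are the raw counts divided by the 108035 observations. The sketch below recomputes a few of them from numbers in this report. The skewness cutoff of 20 is an assumed value, chosen only because it is consistent with the alerts shown (Benzene at 21.6 is flagged, CO at 12.2 is not); the report itself does not state the setting.

```python
N_OBSERVATIONS = 108035

# Counts and skewness values copied from this report.
missing = {"PM2.5": 21625, "NH3": 48105, "Xylene": 85137}
zeros = {"NOx": 4776, "CO": 7280, "Benzene": 12602}
skewness = {"Benzene": 21.61702016, "CO": 12.19795088}

# Hypothetical cutoff, for illustration only.
ASSUMED_SKEW_THRESHOLD = 20.0


def pct(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


missing_pct = {name: pct(count) for name, count in missing.items()}
zeros_pct = {name: pct(count) for name, count in zeros.items()}
skewed = [name for name, g1 in skewness.items() if abs(g1) > ASSUMED_SKEW_THRESHOLD]

print(missing_pct)
print(zeros_pct)
print(skewed)
```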

Reproduction

Analysis started: 2021-12-26 17:02:29.205525
Analysis finished: 2021-12-26 17:02:54.278279
Duration: 25.07 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json
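
The reported duration is simply the difference between the two timestamps. A minimal check using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-12-26 17:02:29.205525", FMT)
finished = datetime.strptime("2021-12-26 17:02:54.278279", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```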

Variables

StationId
Categorical

HIGH CARDINALITY

Distinct: 110
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 844.1 KiB

UP012: 2009
TN001: 2009
DL021: 2009
DL008: 2009
KA009: 2009
Other values (105): 97990

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 540175
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
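
As a small illustration of those character properties, the sketch below looks up the general category and character name for one of the StationId values using Python's standard `unicodedata` module. The standard library does not expose the script property directly, so the character names (for example "LATIN CAPITAL LETTER U") are printed as a rough hint instead; that substitution is a choice made for this example.

```python
import unicodedata
from collections import Counter

station_id = "UP012"  # one of the StationId values in this report

# Two-letter general category per character: "Lu" is Uppercase Letter,
# "Nd" is Decimal Number.
categories = Counter(unicodedata.category(ch) for ch in station_id)

# Character names; letters carry a script prefix such as "LATIN".
names = [unicodedata.name(ch) for ch in station_id]

print(categories)
print(names)
```

A five-character ID made of two letters and three digits gives the same 40% / 60% split between Uppercase Letter and Decimal Number that the report shows for the whole column.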

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AP001
2nd row: AP001
3rd row: AP001
4th row: AP001
5th row: AP001

Common Values

Value               Count  Frequency (%)
UP012               2009   1.9%
TN001               2009   1.9%
DL021               2009   1.9%
DL008               2009   1.9%
KA009               2009   1.9%
TN003               2009   1.9%
KA003               2009   1.9%
DL007               2009   1.9%
DL033               2009   1.9%
DL013               2009   1.9%
Other values (100)  87945  81.4%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
up012               2009   1.9%
dl033               2009   1.9%
mh005               2009   1.9%
up014               2009   1.9%
gj001               2009   1.9%
tn004               2009   1.9%
dl013               2009   1.9%
tn001               2009   1.9%
dl007               2009   1.9%
ka003               2009   1.9%
Other values (100)  87945  81.4%

Most occurring characters

ValueCountFrequency (%)
0171293
31.7%
148475
 
9.0%
L47246
 
8.7%
D47223
 
8.7%
323227
 
4.3%
221884
 
4.1%
T15544
 
2.9%
A14911
 
2.8%
414632
 
2.7%
K13572
 
2.5%
Other values (19)122168
22.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number324105
60.0%
Uppercase Letter216070
40.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L47246
21.9%
D47223
21.9%
T15544
 
7.2%
A14911
 
6.9%
K13572
 
6.3%
G10761
 
5.0%
P10022
 
4.6%
H9808
 
4.5%
R8598
 
4.0%
B7064
 
3.3%
Other values (9)31321
14.5%
Decimal Number
ValueCountFrequency (%)
0171293
52.9%
148475
 
15.0%
323227
 
7.2%
221884
 
6.8%
414632
 
4.5%
513083
 
4.0%
88417
 
2.6%
68350
 
2.6%
78231
 
2.5%
96513
 
2.0%

Most occurring scripts

ValueCountFrequency (%)
Common324105
60.0%
Latin216070
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L47246
21.9%
D47223
21.9%
T15544
 
7.2%
A14911
 
6.9%
K13572
 
6.3%
G10761
 
5.0%
P10022
 
4.6%
H9808
 
4.5%
R8598
 
4.0%
B7064
 
3.3%
Other values (9)31321
14.5%
Common
ValueCountFrequency (%)
0171293
52.9%
148475
 
15.0%
323227
 
7.2%
221884
 
6.8%
414632
 
4.5%
513083
 
4.0%
88417
 
2.6%
68350
 
2.6%
78231
 
2.5%
96513
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII540175
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0171293
31.7%
148475
 
9.0%
L47246
 
8.7%
D47223
 
8.7%
323227
 
4.3%
221884
 
4.1%
T15544
 
2.9%
A14911
 
2.8%
414632
 
2.7%
K13572
 
2.5%
Other values (19)122168
22.6%

Date
Categorical

HIGH CARDINALITY

Distinct: 2009
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 844.1 KiB

2020-03-19: 108
2020-06-17: 108
2020-04-10: 108
2020-06-24: 108
2020-04-12: 108
Other values (2004): 107495

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1080350
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017-11-24
2nd row: 2017-11-25
3rd row: 2017-11-26
4th row: 2017-11-27
5th row: 2017-11-28

Common Values

Value                Count   Frequency (%)
2020-03-19           108     0.1%
2020-06-17           108     0.1%
2020-04-10           108     0.1%
2020-06-24           108     0.1%
2020-04-12           108     0.1%
2020-04-27           108     0.1%
2020-06-28           108     0.1%
2020-06-16           108     0.1%
2020-04-05           108     0.1%
2020-05-27           108     0.1%
Other values (1999)  106955  99.0%

Length

Histogram of lengths of the category

Value                Count   Frequency (%)
2020-03-19           108     0.1%
2020-05-03           108     0.1%
2020-03-13           108     0.1%
2020-06-08           108     0.1%
2020-06-10           108     0.1%
2020-04-23           108     0.1%
2020-06-06           108     0.1%
2020-04-16           108     0.1%
2020-05-20           108     0.1%
2020-05-04           108     0.1%
Other values (1999)  106955  99.0%

Most occurring characters

ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number864280
80.0%
Dash Punctuation216070
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0260693
30.2%
2191188
22.1%
1179627
20.8%
950830
 
5.9%
844185
 
5.1%
732023
 
3.7%
630810
 
3.6%
528358
 
3.3%
325989
 
3.0%
420577
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
-216070
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1080350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII1080350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

PM2.5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 22395
Distinct (%): 25.9%
Missing: 21625
Missing (%): 20.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.27257135
Minimum: 0.02
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.02
5-th percentile: 13.02
Q1: 31.88
Median: 55.95
Q3: 99.92
95-th percentile: 236.5655
Maximum: 1000
Range: 999.98
Interquartile range (IQR): 68.04

Descriptive statistics

Standard deviation: 76.52640254
Coefficient of variation (CV): 0.9533318948
Kurtosis: 10.62438324
Mean: 80.27257135
Median Absolute Deviation (MAD): 29.07
Skewness: 2.563923541
Sum: 6936352.89
Variance: 5856.290285
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1127
 
< 0.1%
31.0823
 
< 0.1%
28.822
 
< 0.1%
24.8322
 
< 0.1%
21.3821
 
< 0.1%
42.521
 
< 0.1%
34.121
 
< 0.1%
29.7521
 
< 0.1%
1520
 
< 0.1%
2120
 
< 0.1%
Other values (22385)86192
79.8%
(Missing)21625
 
20.0%
ValueCountFrequency (%)
0.022
< 0.1%
0.041
< 0.1%
0.151
< 0.1%
0.161
< 0.1%
0.191
< 0.1%
0.21
< 0.1%
0.241
< 0.1%
0.251
< 0.1%
0.281
< 0.1%
0.321
< 0.1%
ValueCountFrequency (%)
10001
< 0.1%
999.991
< 0.1%
9951
< 0.1%
992.671
< 0.1%
949.991
< 0.1%
917.771
< 0.1%
916.671
< 0.1%
914.941
< 0.1%
914.641
< 0.1%
894.751
< 0.1%
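
Several of the PM2.5 figures above are derived from others in the same block. The sketch below recomputes them from the reported values, as a consistency check rather than a description of how the profiling tool computes them internally. It also shows that Distinct (%) and the mean are taken over the non-missing rows only.

```python
# Values copied from the PM2.5 statistics above.
n_observations = 108035
n_missing = 21625
n_distinct = 22395
mean = 80.27257135
std = 76.52640254
q1, q3 = 31.88, 99.92
minimum, maximum = 0.02, 1000
total = 6936352.89  # the reported Sum

n_valid = n_observations - n_missing       # non-missing rows
iqr = q3 - q1                              # interquartile range
value_range = maximum - minimum
cv = std / mean                            # coefficient of variation
variance = std ** 2
distinct_pct = 100 * n_distinct / n_valid  # relative to non-missing rows
mean_from_sum = total / n_valid

print(n_valid, round(iqr, 2), round(value_range, 2))
print(round(cv, 4), round(variance, 2), round(distinct_pct, 1))
```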

PM10
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 29575
Distinct (%): 45.3%
Missing: 42706
Missing (%): 39.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.9684272
Minimum: 0.01
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 30.27
Q1: 70.15
Median: 122.09
Q3: 208.67
95-th percentile: 409.386
Maximum: 1000
Range: 999.99
Interquartile range (IQR): 138.52

Descriptive statistics

Standard deviation: 123.4186718
Coefficient of variation (CV): 0.7812869569
Kurtosis: 3.383081112
Mean: 157.9684272
Median Absolute Deviation (MAD): 61.83
Skewness: 1.644048099
Sum: 10319919.38
Variance: 15232.16854
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
9412
 
< 0.1%
71.8812
 
< 0.1%
71.0512
 
< 0.1%
108.4611
 
< 0.1%
55.6611
 
< 0.1%
56.611
 
< 0.1%
41.6210
 
< 0.1%
3110
 
< 0.1%
46.6810
 
< 0.1%
65.8810
 
< 0.1%
Other values (29565)65220
60.4%
(Missing)42706
39.5%
ValueCountFrequency (%)
0.011
 
< 0.1%
0.021
 
< 0.1%
0.031
 
< 0.1%
0.042
< 0.1%
0.061
 
< 0.1%
0.071
 
< 0.1%
0.081
 
< 0.1%
0.092
< 0.1%
0.13
< 0.1%
0.123
< 0.1%
ValueCountFrequency (%)
10001
< 0.1%
9852
< 0.1%
976.771
< 0.1%
960.981
< 0.1%
955.61
< 0.1%
9521
< 0.1%
9421
< 0.1%
938.51
< 0.1%
936.251
< 0.1%
933.051
< 0.1%

NO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 11963
Distinct (%): 13.2%
Missing: 17106
Missing (%): 15.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.12342399
Minimum: 0.01
Maximum: 470
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 1.5
Q1: 4.84
Median: 10.29
Q3: 24.98
95-th percentile: 93.766
Maximum: 470
Range: 469.99
Interquartile range (IQR): 20.14

Descriptive statistics

Standard deviation: 34.49101855
Coefficient of variation (CV): 1.49160516
Kurtosis: 14.05293685
Mean: 23.12342399
Median Absolute Deviation (MAD): 7.02
Skewness: 3.288711003
Sum: 2102589.82
Variance: 1189.63036
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2.581
 
0.1%
2.8979
 
0.1%
2.9379
 
0.1%
0.7377
 
0.1%
2.4977
 
0.1%
376
 
0.1%
2.8776
 
0.1%
2.8475
 
0.1%
2.9575
 
0.1%
3.9975
 
0.1%
Other values (11953)90159
83.5%
(Missing)17106
 
15.8%
ValueCountFrequency (%)
0.011
 
< 0.1%
0.0210
< 0.1%
0.034
 
< 0.1%
0.042
 
< 0.1%
0.063
 
< 0.1%
0.071
 
< 0.1%
0.083
 
< 0.1%
0.092
 
< 0.1%
0.14
 
< 0.1%
0.114
 
< 0.1%
ValueCountFrequency (%)
4701
< 0.1%
437.851
< 0.1%
436.81
< 0.1%
429.771
< 0.1%
403.941
< 0.1%
390.681
< 0.1%
383.141
< 0.1%
382.441
< 0.1%
374.711
< 0.1%
373.91
< 0.1%

NO2
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12050
Distinct (%): 13.2%
Missing: 16547
Missing (%): 15.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.2407601
Minimum: 0.01
Maximum: 448.05
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 5.3935
Q1: 15.09
Median: 27.21
Q3: 46.93
95-th percentile: 89.88
Maximum: 448.05
Range: 448.04
Interquartile range (IQR): 31.84

Descriptive statistics

Standard deviation: 29.51082713
Coefficient of variation (CV): 0.8374060901
Kurtosis: 11.06061697
Mean: 35.2407601
Median Absolute Deviation (MAD): 14.41
Skewness: 2.359287053
Sum: 3224106.66
Variance: 870.8889177
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
943
 
< 0.1%
17.5839
 
< 0.1%
2037
 
< 0.1%
16.0737
 
< 0.1%
17.8236
 
< 0.1%
9.4736
 
< 0.1%
0.236
 
< 0.1%
9.1436
 
< 0.1%
9.2235
 
< 0.1%
13.635
 
< 0.1%
Other values (12040)91118
84.3%
(Missing)16547
 
15.3%
ValueCountFrequency (%)
0.014
< 0.1%
0.027
< 0.1%
0.039
< 0.1%
0.043
 
< 0.1%
0.053
 
< 0.1%
0.063
 
< 0.1%
0.077
< 0.1%
0.086
< 0.1%
0.097
< 0.1%
0.18
< 0.1%
ValueCountFrequency (%)
448.051
< 0.1%
397.771
< 0.1%
397.311
< 0.1%
394.041
< 0.1%
393.081
< 0.1%
369.031
< 0.1%
363.751
< 0.1%
362.731
< 0.1%
362.51
< 0.1%
362.211
< 0.1%

NOx
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 15608
Distinct (%): 16.9%
Missing: 15500
Missing (%): 14.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.19505538
Minimum: 0
Maximum: 467.63
Zeros: 4776
Zeros (%): 4.4%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 13.97
Median: 26.66
Q3: 50.5
95-th percentile: 134.153
Maximum: 467.63
Range: 467.63
Interquartile range (IQR): 36.53

Descriptive statistics

Standard deviation: 45.1459756
Coefficient of variation (CV): 1.095907632
Kurtosis: 8.454914527
Mean: 41.19505538
Median Absolute Deviation (MAD): 15.84
Skewness: 2.53978547
Sum: 3811984.45
Variance: 2038.159113
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
04776
 
4.4%
6.24601
 
0.6%
2.21516
 
0.5%
9.0537
 
< 0.1%
15.7534
 
< 0.1%
16.5634
 
< 0.1%
11.0234
 
< 0.1%
22.0234
 
< 0.1%
17.0134
 
< 0.1%
15.0933
 
< 0.1%
Other values (15598)86402
80.0%
(Missing)15500
 
14.3%
ValueCountFrequency (%)
04776
4.4%
0.017
 
< 0.1%
0.023
 
< 0.1%
0.0314
 
< 0.1%
0.0413
 
< 0.1%
0.054
 
< 0.1%
0.062
 
< 0.1%
0.073
 
< 0.1%
0.082
 
< 0.1%
0.092
 
< 0.1%
ValueCountFrequency (%)
467.631
< 0.1%
453.611
< 0.1%
442.691
< 0.1%
440.311
< 0.1%
434.91
< 0.1%
429.381
< 0.1%
402.271
< 0.1%
399.871
< 0.1%
395.861
< 0.1%
395.331
< 0.1%

NH3
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 9119
Distinct (%): 15.2%
Missing: 48105
Missing (%): 44.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.73287519
Minimum: 0.01
Maximum: 418.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 3.85
Q1: 11.9
Median: 23.59
Q3: 38.1375
95-th percentile: 70.23
Maximum: 418.9
Range: 418.89
Interquartile range (IQR): 26.2375

Descriptive statistics

Standard deviation: 24.89779732
Coefficient of variation (CV): 0.8665264843
Kurtosis: 22.77179802
Mean: 28.73287519
Median Absolute Deviation (MAD): 12.6
Skewness: 3.218919233
Sum: 1721961.21
Variance: 619.9003114
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
13.4249
 
< 0.1%
6.2942
 
< 0.1%
14.6239
 
< 0.1%
6.338
 
< 0.1%
6.2838
 
< 0.1%
6.3136
 
< 0.1%
6.634
 
< 0.1%
10.4233
 
< 0.1%
6.2532
 
< 0.1%
6.3232
 
< 0.1%
Other values (9109)59557
55.1%
(Missing)48105
44.5%
ValueCountFrequency (%)
0.014
 
< 0.1%
0.029
 
< 0.1%
0.031
 
< 0.1%
0.042
 
< 0.1%
0.052
 
< 0.1%
0.066
 
< 0.1%
0.071
 
< 0.1%
0.082
 
< 0.1%
0.091
 
< 0.1%
0.123
< 0.1%
ValueCountFrequency (%)
418.91
< 0.1%
408.581
< 0.1%
379.321
< 0.1%
371.361
< 0.1%
365.681
< 0.1%
361.751
< 0.1%
356.731
< 0.1%
352.891
< 0.1%
349.251
< 0.1%
335.911
< 0.1%

CO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 2352
Distinct (%): 2.5%
Missing: 12998
Missing (%): 12.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.60574934
Minimum: 0
Maximum: 175.81
Zeros: 7280
Zeros (%): 6.7%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.53
Median: 0.91
Q3: 1.45
95-th percentile: 3.38
Maximum: 175.81
Range: 175.81
Interquartile range (IQR): 0.92

Descriptive statistics

Standard deviation: 4.369577753
Coefficient of variation (CV): 2.721207878
Kurtosis: 224.1688928
Mean: 1.60574934
Median Absolute Deviation (MAD): 0.43
Skewness: 12.19795088
Sum: 152605.6
Variance: 19.09320974
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
07280
 
6.7%
0.64713
 
0.7%
0.66696
 
0.6%
0.7688
 
0.6%
0.78679
 
0.6%
0.74673
 
0.6%
0.68671
 
0.6%
0.6662
 
0.6%
0.79659
 
0.6%
0.76656
 
0.6%
Other values (2342)81660
75.6%
(Missing)12998
 
12.0%
ValueCountFrequency (%)
07280
6.7%
0.0196
 
0.1%
0.02125
 
0.1%
0.0354
 
< 0.1%
0.0452
 
< 0.1%
0.0573
 
0.1%
0.0654
 
< 0.1%
0.0745
 
< 0.1%
0.0854
 
< 0.1%
0.0959
 
0.1%
ValueCountFrequency (%)
175.811
< 0.1%
145.321
< 0.1%
134.851
< 0.1%
132.471
< 0.1%
132.071
< 0.1%
124.011
< 0.1%
119.681
< 0.1%
119.31
< 0.1%
118.021
< 0.1%
1181
< 0.1%

SO2
Real number (ℝ≥0)

MISSING

Distinct: 5801
Distinct (%): 7.0%
Missing: 25204
Missing (%): 23.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.2576341
Minimum: 0.01
Maximum: 195.65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 2.07
Q1: 5.04
Median: 8.95
Q3: 14.92
95-th percentile: 32.54
Maximum: 195.65
Range: 195.64
Interquartile range (IQR): 9.88

Descriptive statistics

Standard deviation: 12.98472338
Coefficient of variation (CV): 1.059317261
Kurtosis: 34.72063189
Mean: 12.2576341
Median Absolute Deviation (MAD): 4.54
Skewness: 4.581281421
Sum: 1015312.09
Variance: 168.6030412
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3.484
 
0.1%
3.3880
 
0.1%
479
 
0.1%
5.8678
 
0.1%
3.2875
 
0.1%
3.5175
 
0.1%
3.4875
 
0.1%
6.4274
 
0.1%
5.874
 
0.1%
6.9974
 
0.1%
Other values (5791)82063
76.0%
(Missing)25204
 
23.3%
ValueCountFrequency (%)
0.012
 
< 0.1%
0.021
 
< 0.1%
0.033
 
< 0.1%
0.046
< 0.1%
0.053
 
< 0.1%
0.068
< 0.1%
0.073
 
< 0.1%
0.0810
< 0.1%
0.094
 
< 0.1%
0.16
< 0.1%
ValueCountFrequency (%)
195.651
< 0.1%
193.861
< 0.1%
187.021
< 0.1%
186.081
< 0.1%
182.391
< 0.1%
180.851
< 0.1%
179.181
< 0.1%
178.931
< 0.1%
178.631
< 0.1%
178.581
< 0.1%

O3
Real number (ℝ≥0)

MISSING

Distinct: 11166
Distinct (%): 13.5%
Missing: 25568
Missing (%): 23.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.13483551
Minimum: 0.01
Maximum: 963
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 7.02
Q1: 18.895
Median: 30.84
Q3: 47.14
95-th percentile: 81
Maximum: 963
Range: 962.99
Interquartile range (IQR): 28.245

Descriptive statistics

Standard deviation: 39.12800382
Coefficient of variation (CV): 1.026043598
Kurtosis: 75.83136294
Mean: 38.13483551
Median Absolute Deviation (MAD): 13.49
Skewness: 6.844483445
Sum: 3144865.48
Variance: 1531.000683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16.4835
 
< 0.1%
22.9434
 
< 0.1%
23.633
 
< 0.1%
22.533
 
< 0.1%
34.432
 
< 0.1%
25.6431
 
< 0.1%
19.1231
 
< 0.1%
23.5931
 
< 0.1%
21.430
 
< 0.1%
21.2829
 
< 0.1%
Other values (11156)82148
76.0%
(Missing)25568
 
23.7%
ValueCountFrequency (%)
0.014
 
< 0.1%
0.029
< 0.1%
0.034
 
< 0.1%
0.043
 
< 0.1%
0.054
 
< 0.1%
0.063
 
< 0.1%
0.073
 
< 0.1%
0.082
 
< 0.1%
0.093
 
< 0.1%
0.111
< 0.1%
ValueCountFrequency (%)
9631
< 0.1%
868.21
< 0.1%
819.061
< 0.1%
777.671
< 0.1%
763.121
< 0.1%
7571
< 0.1%
714.751
< 0.1%
705.321
< 0.1%
698.671
< 0.1%
694.961
< 0.1%

Benzene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 3017
Distinct (%): 3.9%
Missing: 31455
Missing (%): 29.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.35802925
Minimum: 0
Maximum: 455.03
Zeros: 12602
Zeros (%): 11.7%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.16
Median: 1.21
Q3: 3.61
95-th percentile: 11.35
Maximum: 455.03
Range: 455.03
Interquartile range (IQR): 3.45

Descriptive statistics

Standard deviation: 11.15623448
Coefficient of variation (CV): 3.322256492
Kurtosis: 695.2028255
Mean: 3.35802925
Median Absolute Deviation (MAD): 1.21
Skewness: 21.61702016
Sum: 257157.88
Variance: 124.4615677
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
012602
 
11.7%
0.1689
 
0.6%
0.01541
 
0.5%
0.02474
 
0.4%
0.12439
 
0.4%
0.11422
 
0.4%
0.03420
 
0.4%
0.05409
 
0.4%
0.04403
 
0.4%
0.13398
 
0.4%
Other values (3007)59783
55.3%
(Missing)31455
29.1%
ValueCountFrequency (%)
012602
11.7%
0.01541
 
0.5%
0.02474
 
0.4%
0.03420
 
0.4%
0.04403
 
0.4%
0.05409
 
0.4%
0.06374
 
0.3%
0.07321
 
0.3%
0.08337
 
0.3%
0.09337
 
0.3%
ValueCountFrequency (%)
455.031
< 0.1%
454.851
< 0.1%
449.381
< 0.1%
448.591
< 0.1%
445.831
< 0.1%
443.631
< 0.1%
438.011
< 0.1%
435.91
< 0.1%
435.091
< 0.1%
432.941
< 0.1%

Toluene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 8713
Distinct (%): 12.6%
Missing: 38702
Missing (%): 35.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.34539426
Minimum: 0
Maximum: 454.85
Zeros: 10455
Zeros (%): 9.7%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.69
Median: 4.33
Q3: 17.51
95-th percentile: 64.934
Maximum: 454.85
Range: 454.85
Interquartile range (IQR): 16.82

Descriptive statistics

Standard deviation: 29.3485873
Coefficient of variation (CV): 1.912533938
Kurtosis: 33.33721267
Mean: 15.34539426
Median Absolute Deviation (MAD): 4.33
Skewness: 4.598334818
Sum: 1063942.22
Variance: 861.3395766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
010455
 
9.7%
0.01268
 
0.2%
0.02199
 
0.2%
0.07198
 
0.2%
0.03179
 
0.2%
0.04173
 
0.2%
0.1173
 
0.2%
0.08167
 
0.2%
0.06161
 
0.1%
0.09153
 
0.1%
Other values (8703)57207
53.0%
(Missing)38702
35.8%
ValueCountFrequency (%)
010455
9.7%
0.01268
 
0.2%
0.02199
 
0.2%
0.03179
 
0.2%
0.04173
 
0.2%
0.05147
 
0.1%
0.06161
 
0.1%
0.07198
 
0.2%
0.08167
 
0.2%
0.09153
 
0.1%
ValueCountFrequency (%)
454.851
< 0.1%
454.121
< 0.1%
449.141
< 0.1%
448.871
< 0.1%
445.841
< 0.1%
443.631
< 0.1%
437.771
< 0.1%
435.941
< 0.1%
434.921
< 0.1%
433.021
< 0.1%

Xylene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1892
Distinct (%): 8.3%
Missing: 85137
Missing (%): 78.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.423446153
Minimum: 0
Maximum: 170.37
Zeros: 6083
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.4
Q3: 2.11
95-th percentile: 10.7315
Maximum: 170.37
Range: 170.37
Interquartile range (IQR): 2.11

Descriptive statistics

Standard deviation: 6.472408501
Coefficient of variation (CV): 2.670745745
Kurtosis: 119.5691605
Mean: 2.423446153
Median Absolute Deviation (MAD): 0.4
Skewness: 8.629738992
Sum: 55492.07
Variance: 41.8920718
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
06083
 
5.6%
0.01500
 
0.5%
0.02423
 
0.4%
0.1327
 
0.3%
0.03297
 
0.3%
0.04241
 
0.2%
0.05188
 
0.2%
0.12183
 
0.2%
0.06180
 
0.2%
0.11168
 
0.2%
Other values (1882)14308
 
13.2%
(Missing)85137
78.8%
ValueCountFrequency (%)
06083
5.6%
0.01500
 
0.5%
0.02423
 
0.4%
0.03297
 
0.3%
0.04241
 
0.2%
0.05188
 
0.2%
0.06180
 
0.2%
0.07151
 
0.1%
0.08148
 
0.1%
0.09135
 
0.1%
ValueCountFrequency (%)
170.371
< 0.1%
137.451
< 0.1%
133.61
< 0.1%
132.971
< 0.1%
129.281
< 0.1%
125.181
< 0.1%
123.291
< 0.1%
116.621
< 0.1%
109.981
< 0.1%
109.231
< 0.1%

AQI
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 930
Distinct (%): 1.1%
Missing: 21010
Missing (%): 19.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 179.7492904
Minimum: 8
Maximum: 2049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 844.1 KiB

Quantile statistics

Minimum: 8
5-th percentile: 47
Q1: 86
Median: 132
Q3: 254
95-th percentile: 415
Maximum: 2049
Range: 2041
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 131.3243389
Coefficient of variation (CV): 0.7305972588
Kurtosis: 8.532544269
Mean: 179.7492904
Median Absolute Deviation (MAD): 62
Skewness: 1.930087687
Sum: 15642682
Variance: 17246.08198
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
104615
 
0.6%
102593
 
0.5%
106587
 
0.5%
108561
 
0.5%
100560
 
0.5%
88549
 
0.5%
98547
 
0.5%
90546
 
0.5%
92546
 
0.5%
78545
 
0.5%
Other values (920)81376
75.3%
(Missing)21010
 
19.4%
ValueCountFrequency (%)
81
 
< 0.1%
102
 
< 0.1%
132
 
< 0.1%
147
 
< 0.1%
155
 
< 0.1%
1610
 
< 0.1%
1713
 
< 0.1%
1811
 
< 0.1%
1936
< 0.1%
2049
< 0.1%
ValueCountFrequency (%)
20491
< 0.1%
19171
< 0.1%
18421
< 0.1%
17471
< 0.1%
17191
< 0.1%
16721
< 0.1%
16461
< 0.1%
16301
< 0.1%
16131
< 0.1%
15951
< 0.1%

AQI_Bucket
Categorical

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 21010
Missing (%): 19.4%
Memory size: 844.1 KiB

Moderate: 29417
Satisfactory: 23636
Very Poor: 11762
Poor: 11493
Good: 5510

Length

Max length: 12
Median length: 8
Mean length: 8.32036771
Min length: 4
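
The mean length can be reproduced from the per-category counts listed in this section's Common Values table: weight each label's length by its count and divide by the number of non-missing rows. The sketch below does that as a consistency check; the variable names are illustrative.

```python
# Category counts copied from the AQI_Bucket Common Values table.
counts = {
    "Moderate": 29417,
    "Satisfactory": 23636,
    "Very Poor": 11762,
    "Poor": 11493,
    "Good": 5510,
    "Severe": 5207,
}

n_values = sum(counts.values())  # non-missing rows
# Each label contributes len(label) characters per occurrence (spaces included).
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_values

print(n_values, total_characters, round(mean_length, 8))
```

The same sum gives the total character count reported under Characters and Unicode.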

Characters and Unicode

Total characters: 724080
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moderate
2nd row: Moderate
3rd row: Moderate
4th row: Moderate
5th row: Moderate

Common Values

Value         Count  Frequency (%)
Moderate      29417  27.2%
Satisfactory  23636  21.9%
Very Poor     11762  10.9%
Poor          11493  10.6%
Good          5510   5.1%
Severe        5207   4.8%
(Missing)     21010  19.4%

Length

Histogram of lengths of the category

Pie chart

[Pie chart of the category frequencies]

Value         Count  Frequency (%)
moderate      29417  29.8%
satisfactory  23636  23.9%
poor          23255  23.5%
very          11762  11.9%
good          5510   5.6%
severe        5207   5.3%
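
The word counts above differ from the category counts because "Very Poor" contributes to both "very" and "poor". The sketch below reproduces the table by lowercasing each label and splitting it on whitespace. That tokenization is an assumption that happens to match the numbers shown, not a documented description of the profiling tool.

```python
from collections import Counter

# Category counts copied from the AQI_Bucket Common Values table.
counts = {
    "Moderate": 29417,
    "Satisfactory": 23636,
    "Very Poor": 11762,
    "Poor": 11493,
    "Good": 5510,
    "Severe": 5207,
}

# Assumed tokenization: lowercase, then split on whitespace.
words = Counter()
for label, n in counts.items():
    for token in label.lower().split():
        words[token] += n

total_tokens = sum(words.values())
shares = {word: round(100 * c / total_tokens, 1) for word, c in words.items()}

print(words.most_common())
print(shares)
```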

Most occurring characters

ValueCountFrequency (%)
o110583
15.3%
r93277
12.9%
e86217
11.9%
a76689
10.6%
t76689
10.6%
y35398
 
4.9%
d34927
 
4.8%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (8)128404
17.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter613531
84.7%
Uppercase Letter98787
 
13.6%
Space Separator11762
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o110583
18.0%
r93277
15.2%
e86217
14.1%
a76689
12.5%
t76689
12.5%
y35398
 
5.8%
d34927
 
5.7%
c23636
 
3.9%
s23636
 
3.9%
f23636
 
3.9%
Other values (2)28843
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
M29417
29.8%
S28843
29.2%
P23255
23.5%
V11762
 
11.9%
G5510
 
5.6%
Space Separator
ValueCountFrequency (%)
11762
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin712318
98.4%
Common11762
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o110583
15.5%
r93277
13.1%
e86217
12.1%
a76689
10.8%
t76689
10.8%
y35398
 
5.0%
d34927
 
4.9%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (7)116642
16.4%
Common
ValueCountFrequency (%)
11762
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII724080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o110583
15.3%
r93277
12.9%
e86217
11.9%
a76689
10.6%
t76689
10.6%
y35398
 
4.9%
d34927
 
4.8%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (8)128404
17.7%

Interactions

[Figures: grid of 169 pairwise interaction plots of the numeric variables (Matplotlib v3.4.2); images not reproduced in this text version.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
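The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code the report generator runs: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, the input values are made up, and ties are assumed to receive the mean of the ranks they span.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Illustrative values only (not taken from the dataset):
# y is a nonlinear but strictly increasing function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [a ** 3 for a in x]
rho = spearman_rho(x, y)  # close to 1: the relationship is perfectly monotonic
r = pearson_r(x, y)       # below 1: the relationship is not linear
```

Comparing `rho` with `r` on the same data shows why ρ is the better choice for monotonic but nonlinear relationships.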

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
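As a minimal sketch of this calculation, assuming population (divide-by-n) covariance and standard deviations, the following plain-Python function computes r directly from the definition. The name `pearson_r` and the input values are illustrative and are not taken from the dataset or from the report generator.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative values only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
r = pearson_r(x, y)

# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
```

Because the factor 1/n cancels between numerator and denominator, using sample (divide-by-(n-1)) statistics instead would give the same r.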

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
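The pair-counting rule above can be sketched as follows. This follows the formula in the text literally, which corresponds to the variant usually called τ-a; library implementations often use the tie-adjusted τ-b variant instead, so results can differ when the data contain ties. The function name `kendall_tau_a` and the input values are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in x or y count as neither.
    return (concordant - discordant) / len(pairs)


# Illustrative values only (not taken from the dataset).
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
tau = kendall_tau_a(x, y)  # 8 concordant and 2 discordant pairs out of 10
```

This direct implementation compares every pair and therefore scales quadratically with the number of observations; it is meant to make the definition concrete rather than to be used on a dataset of this size.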

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
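One common way to compute the nullity correlation mentioned above is to replace each column by a presence indicator (1 if the value is present, 0 if it is missing) and take the Pearson correlation of those indicators. The sketch below assumes that definition; the function name `nullity_correlation` is hypothetical, missing values are represented by `None`, and the inputs are the NO, NO2 and PM2.5 values of the last ten rows shown in the Sample section.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators of two columns.

    Returns None when either column is entirely present or entirely
    missing, because the correlation is undefined in that case.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a) / n
    var_b = sum((v - mean_b) ** 2 for v in b) / n
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    return cov / sqrt(var_a * var_b)


# NO, NO2 and PM2.5 from the last ten rows of the sample (NaN written as None).
no = [2.59, 3.06, 3.78, 3.23, 23.51, None, None, 13.65, 7.56, 7.78]
no2 = [18.04, 20.94, 15.28, 11.42, 16.50, None, None, 200.87, 29.13, 22.50]
pm25 = [15.10, 19.48, 20.05, 17.03, 9.79, 8.65, 11.80, 18.60, 16.07, 10.50]

no_vs_no2 = nullity_correlation(no, no2)     # missing on exactly the same rows
pm25_vs_no = nullity_correlation(pm25, no)   # PM2.5 has no gaps in these rows
```

In these ten rows NO and NO2 are missing together, so their nullity correlation is at its maximum, while a column without gaps yields no defined value.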

Sample

First rows

Index  StationId  Date        PM2.5  PM10    NO    NO2    NOx    NH3    CO    SO2    O3      Benzene  Toluene  Xylene  AQI    AQI_Bucket
0      AP001      2017-11-24  71.36  115.75  1.75  20.65  12.40  12.19  0.10  10.76  109.26  0.17     5.92     0.10    NaN    NaN
1      AP001      2017-11-25  81.40  124.50  1.44  20.50  12.08  10.72  0.12  15.24  127.09  0.20     6.50     0.06    184.0  Moderate
2      AP001      2017-11-26  78.32  129.06  1.26  26.00  14.85  10.28  0.14  26.96  117.44  0.22     7.95     0.08    197.0  Moderate
3      AP001      2017-11-27  88.76  135.32  6.60  30.85  21.77  12.91  0.11  33.59  111.81  0.29     7.63     0.12    198.0  Moderate
4      AP001      2017-11-28  64.18  104.09  2.56  28.07  17.01  11.42  0.09  19.00  138.18  0.17     5.02     0.07    188.0  Moderate
5      AP001      2017-11-29  72.47  114.84  5.23  23.20  16.59  12.25  0.16  10.55  109.74  0.21     4.71     0.08    173.0  Moderate
6      AP001      2017-11-30  69.80  114.86  4.69  20.17  14.54  10.95  0.12  14.07  118.09  0.16     3.52     0.06    165.0  Moderate
7      AP001      2017-12-01  73.96  113.56  4.58  19.29  13.97  10.95  0.10  13.90  123.80  0.17     2.85     0.04    191.0  Moderate
8      AP001      2017-12-02  89.90  140.20  7.71  26.19  19.87  13.12  0.10  19.37  128.73  0.25     2.79     0.07    191.0  Moderate
9      AP001      2017-12-03  87.14  130.52  0.97  21.31  12.12  14.36  0.15  11.41  114.80  0.23     3.82     0.04    227.0  Poor

Last rows

Index   StationId  Date        PM2.5  PM10   NO     NO2     NOx     NH3    CO    SO2    O3     Benzene  Toluene  Xylene  AQI   AQI_Bucket
108025  WB013      2020-06-22  15.10  30.98  2.59   18.04   20.63   30.34  0.67  1.50   25.84  1.28     8.32     NaN     38.0  Good
108026  WB013      2020-06-23  19.48  42.37  3.06   20.94   23.99   32.53  0.70  1.72   28.21  1.65     5.93     NaN     44.0  Good
108027  WB013      2020-06-24  20.05  46.63  3.78   15.28   19.06   18.89  0.66  7.08   41.56  1.14     7.24     NaN     59.0  Satisfactory
108028  WB013      2020-06-25  17.03  39.64  3.23   11.42   14.65   18.98  0.57  11.39  31.76  0.79     6.85     NaN     56.0  Satisfactory
108029  WB013      2020-06-26  9.79   19.87  23.51  16.50   40.02   25.09  0.66  10.34  30.19  0.93     6.37     NaN     50.0  Good
108030  WB013      2020-06-27  8.65   16.46  NaN    NaN     NaN     NaN    0.69  4.36   30.59  1.32     7.26     NaN     50.0  Good
108031  WB013      2020-06-28  11.80  18.47  NaN    NaN     NaN     NaN    0.68  3.49   38.95  1.42     7.92     NaN     65.0  Satisfactory
108032  WB013      2020-06-29  18.60  32.26  13.65  200.87  214.20  11.40  0.78  5.12   38.17  3.52     8.64     NaN     63.0  Satisfactory
108033  WB013      2020-06-30  16.07  39.30  7.56   29.13   36.69   29.26  0.69  5.88   29.64  1.86     8.40     NaN     57.0  Satisfactory
108034  WB013      2020-07-01  10.50  36.50  7.78   22.50   30.25   27.23  0.58  2.80   13.10  1.31     7.39     NaN     59.0  Satisfactory